The methods of cluster analysis can be applied to image segmentation by assigning to each pixel a label indicating some shared characteristic, such as boundary membership or colour. The goal is to partition an image into meaningful regions, each corresponding to one particular surface, object, etc. Applications include image processing and compression, object recognition, database searching, and edge detection for feature extraction.
The purpose of this project is to walk through the process of segmenting an image in detail. The image involved determines which frameworks are most effective in achieving a good segmentation. We will look at the K-means algorithm, which performs well when the objects in an image are well separated in a colour space, and then at a more advanced procedure for segmenting the soft tissues present in medical images of the brain.
The imager package contains some example images to work with. We will import the parrots file, as it highlights the ease with which algorithms can segment within a three-dimensional colour space (RGB).
file <- system.file('extdata/parrots.png',package='imager')
parrots <- load.image(file)
parrots
## Image. Width: 768 pix Height: 512 pix Depth: 1 Colour channels: 3
The data is stored as an object of class cimg, which behaves like an array. Plotting with the ggplot2 package requires the data to be stored as a data frame, so we convert the default cimg object.
parrots_df <- as.data.frame(parrots,wide="c") %>% # wide format with channels as columns
mutate(rgb.val=rgb(c.1,c.2,c.3))
names <- c("x", "y", "R", "G", "B", "rgb.val")
colnames(parrots_df) <- names
| x | y | R | G | B | rgb.val |
|---|---|---|---|---|---|
| 1 | 1 | 0.4549020 | 0.4549020 | 0.3450980 | #747458 |
| 2 | 1 | 0.4588235 | 0.4588235 | 0.3490196 | #757559 |
| 3 | 1 | 0.4705882 | 0.4705882 | 0.3607843 | #78785C |
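As a quick sanity check on the rgb.val column, base R's rgb() maps the channel intensities in the first row of the table to the hex string shown:

```r
# channel intensities from the first row of parrots_df
r <- 0.4549020
g <- 0.4549020
b <- 0.3450980
rgb(r, g, b)  # "#747458"
```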
The image has a wide colour palette that we hope will serve as the basis of our segmentation. Without using spatial information, each pixel has a corresponding three-dimensional vector of colour channels: red, green, and blue. The K-means algorithm should perform well here, since the colours help to differentiate the parrots from each other and from the background (i.e. good separation).
The K-means method is a simple solution to the problem of finding cluster centres that minimize the variance within each cluster. Each cluster in our problem refers to a specific colour value; consequently, a pixel is considered to be “like” another if they are near each other in the colour space, with Euclidean distance defining the notion of “nearness”. K-means requires that we specify the number of clusters a priori, and determining the optimal number depends on the problem at hand. In segmenting an image, the goal is only to find the clusters/colours that we deem most important in separating the objects. The salient colours in the parrots image are red, green, turquoise, yellow, white, and black, so we might choose around 6-8 clusters.

Next, we need to choose a method of initialization for the cluster centres. The K-means++ algorithm is a seeding procedure for the centres which improves the final error and decreases computation time. Intuitively, the initial centres should be chosen so that they are not too close together. This idea is implemented in the function below.
# K-means++ seeding; rows of the returned matrix are the initial centers
centers <- function(x, k) {
  x <- as.matrix(x)
  center.mat <- matrix(0, nrow=k, ncol=ncol(x))
  # first center chosen uniformly at random
  center.mat[1, ] <- x[sample(nrow(x), size=1), ]
  for (i in 2:k) {
    # squared Euclidean distance from each point to its nearest chosen center
    d2 <- rep(Inf, nrow(x))
    for (j in 1:(i-1)) {
      d2 <- pmin(d2, rowSums(sweep(x, 2, center.mat[j, ])^2))
    }
    # next center sampled with probability proportional to that squared distance
    center.mat[i, ] <- x[sample(nrow(x), size=1, prob=d2), ]
  }
  center.mat
}
Although this initialization process requires some computational effort, the K-means procedure should converge more quickly on average. We now use it in conjunction with R's built-in kmeans function.
kmpp <- function(x, k) {
  centers <- centers(x, k)
  kmeans(x, centers)
}
cl <- kmpp(parrots_df[, c("R", "G", "B")], k=8)
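To see the result, the cluster assignments can be mapped back onto the pixel grid. This is only a sketch, assuming the objects cl and parrots_df built above:

```r
library(ggplot2)

# replace each pixel's colour with the colour of its cluster centre
parrots_df$seg <- rgb(cl$centers[cl$cluster, ])

# rebuild the image: one tile per pixel, filled with its cluster's colour
ggplot(parrots_df, aes(x = x, y = y, fill = seg)) +
  geom_raster() +
  scale_fill_identity() +  # the hex strings are used as literal colours
  scale_y_reverse() +      # image y-coordinates increase downward
  theme_void()
```

With k=8 the plot collapses the image to eight flat colours, which is usually enough to separate the parrots from the background here.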
kColours <- rgb(cl$centers[cl$cluster, ])

A magnetic resonance imaging (MRI) scan of the brain presents an important application of image-based segmentation. The three main tissue classes normally segmented in a scan are white matter (WM), gray matter (GM), and cerebrospinal fluid (CSF). Segmentation is facilitated by high contrast between these tissue types. Generating improved tissue contrast requires varying the parameters of the pulse sequences responsible for causing the hydrogen atoms in the body to emit radio frequencies when subjected to a magnetic field. The contrast achieved through MRI scans is an improvement over computed tomography; however, the results still suffer from obstacles to segmentation such as noise, intensity inhomogeneity (the bias field), and partial-volume effects.
The paper Segmentation of Brain MR Images Through a Hidden Markov Random Field Model and the Expectation-Maximization Algorithm by Yongyue Zhang et al. provides justification for a Hidden Markov Random Field (HMRF) model, together with an Expectation-Maximization (EM) method of iterative parameter updating, as a framework that effectively overcomes these difficulties in segmentation.
In this project we adopt this framework to segment a single 2-D slice taken from an MRI scan obtained from Neuromorphometrics, Inc. The success of this approach offers a means for automatic brain MR image segmentation.
The Neuroimaging Informatics Technology Initiative (NIfTI) format is an improvement on the ANALYZE format, retaining a header/image combination of data but also allowing for storage of auxiliary information inside the file. The readMRI function from the mritc package can be used to obtain an array with the appropriate image dimensions; here we use readNIfTI from the oro.nifti package, which returns a nifti object with the header attached.
# Read in NIfTI data
# data <- readMRI("NIFTI/1103_3.nii", format="nifti")
img <- readNIfTI("NIFTI/1103_3.nii")
# See all attributes of the NIfTI object
img
## NIfTI-1 format
## Type : nifti
## Data Type : 512 (UINT16)
## Bits per Pixel : 16
## Slice Code : 0 (Unknown)
## Intent Code : 0 (None)
## Qform Code : 1 (Scanner_Anat)
## Sform Code : 0 (Unknown)
## Dimension : 256 x 256 x 332
## Pixel Dimension : 1 x 1 x 1
## Voxel Units : mm
## Time Units : Unknown
First we need to access the data in the object. Next we select a particular slice, ranging over the z-values of the three-dimensional volume, and visualize a transverse segment.
data <- slot(img, ".Data")
class(data)
## [1] "array"
slice <- data[, , 140] # transverse slice at z=140
The two-dimensional image has pixel dimensions:
dim(slice)
## [1] 256 256
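The slice can then be displayed with base graphics. A minimal sketch, assuming the slice matrix from above (intensities are mapped onto a grey ramp):

```r
# display the transverse slice as a greyscale image;
# t() plus column reversal flips the matrix into conventional viewing orientation
image(t(slice)[, nrow(slice):1],
      col = grey(seq(0, 1, length.out = 256)),
      axes = FALSE, asp = 1)
```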
Transverse slice
To make inferences on the data we consider an image graph denoted \(\mathcal{G} = (\mathcal{V}, \mathcal{E})\), where \(\mathcal{V}\) is the set of nodes corresponding to the pixels (or groups of pixels) making up the image, and \(\mathcal{E}\) is the set of undirected edges. As mentioned earlier, the set of hidden classes attributable to each node is \(\mathcal{S} =\{WM, GM, CSF \}\), and these are to be estimated from the pixel intensities (observations). To incorporate the greater likelihood that neighbouring pixels share a class label, an edge \((i,j) \in \mathcal{E}\) must be accompanied by a bias that encodes this tendency. A Markov model accounts only for the dependencies between pixels sharing a common edge: for pixel \(i\), define the set of its neighbours to be \(\mathcal{N}_i = \{j: (i,j) \in \mathcal{E} \}\).
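For a regular pixel lattice, a common choice of neighbourhood system (an assumption here; the definition above does not fix one) is the first-order, 4-neighbour system. A small helper illustrates it:

```r
# 4-neighbourhood of pixel (i, j) on an n x n lattice: interior pixels
# have four neighbours, edge pixels three, and corner pixels two
neighbours <- function(i, j, n) {
  cand <- rbind(c(i - 1, j), c(i + 1, j), c(i, j - 1), c(i, j + 1))
  inside <- cand[, 1] >= 1 & cand[, 1] <= n & cand[, 2] >= 1 & cand[, 2] <= n
  cand[inside, , drop = FALSE]
}

nrow(neighbours(1, 1, 256))      # a corner pixel: 2 neighbours
nrow(neighbours(100, 100, 256))  # an interior pixel: 4 neighbours
```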
The sequence of pixel labels, denoted \(\{X_1, \dots, X_{256^2}\}\), is said to be a Markov Random Field (MRF) on the state space \(\mathcal{S}\) with respect to \(\mathcal{N}\) if and only if it has the following properties: positivity, \(P(\mathbf{x}) > 0\) for every configuration \(\mathbf{x}\); and Markovianity, \(P(x_i \mid x_{\mathcal{V} \setminus \{i\}}) = P(x_i \mid x_{\mathcal{N}_i})\), so that the label at a pixel depends on the rest of the image only through its neighbours.
The Hammersley-Clifford theorem states that an MRF with the properties listed above is equivalently a Gibbs random field – that is, we may express the MRF as a Gibbs distribution: \[ P(\mathbf{x}) = \frac{1}{Z}\exp\bigg\{-\sum_{c\in\mathcal{C}}\Psi_c(\mathbf{x}) \bigg\} \] where \(Z\) is the normalizing constant (or partition function) that makes the distribution sum to unity, and \(E(\mathbf{x}) = \sum_{c \in \mathcal{C}}\Psi_c(\mathbf{x})\) is the cost function for the learning stage. Here \(\mathcal{C}\) denotes the collection of cliques in the graph \(\mathcal{G}\). A clique \(c\) is a fully connected subgraph of \(\mathcal{G}\), meaning each distinct pair of pixel sites (\(i, j \, : i\neq j\)) is neighbouring (\(i \in \mathcal{N}_j \wedge j \in \mathcal{N}_i\)).
The value of the clique potential \(\Psi_c(\mathbf{x})\) should decrease \(E(\mathbf{x})\) when two adjacent pixels have similar intensities, and thus increase the joint probability \(P(\mathbf{x})\).
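As a concrete example (the homogeneous Potts model, a standard choice rather than necessarily the exact potential used in this framework), one can take the pairwise clique potential \[ \Psi_{\{i,j\}}(\mathbf{x}) = \beta\big(1 - \delta(x_i, x_j)\big), \qquad \beta > 0, \] where \(\delta(a,b) = 1\) if \(a = b\) and \(0\) otherwise. Two neighbouring pixels with the same label contribute \(0\) to \(E(\mathbf{x})\), while a disagreeing pair contributes \(\beta\), so configurations with smooth label maps receive higher probability \(P(\mathbf{x})\).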
A work by Derek Wayne
derekwayne4@gmail.com